Summarization via Pattern Utility and Ranking: A Novel Framework for Social Media Data Analytics

Authors

  • Xintian Yang
  • Yiye Ruan
  • Srinivasan Parthasarathy
  • Amol Ghoting
Abstract

The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. The data can be classified into two categories: the textual content written by the users and the topological structure of the connections among users. Real-time analytics on such data is challenging, with most current efforts focusing largely on the efficient querying and retrieval of recently produced data. In this article, we present a dynamic pattern-driven approach to summarizing social network content and topology. The resulting family of algorithms relies on the common principles of summarization via pattern utilities and ranking (SPUR). SPUR and its dynamic variant (D-SPUR) rely on an in-memory summary while retaining sufficient information to facilitate a range of user-specific and topic-specific temporal analytics. We then describe variants that take the implicit graph of connections into account to realize the graph-based SPUR variant (G-SPUR). Finally, we describe scalable algorithms for implementing these ideas on commercial GPU-based systems. We examine the effectiveness of the summarization approaches along the axes of storage cost, query accuracy, and efficiency using real data from Twitter.

Similar resources

Timeline Summarization from Social Media with Life Cycle Models

The popularity of social media shatters the barrier for online users to create and share information at any place at any time. As a consequence, it has become increasingly difficult to locate relevant information about an entity. Timelines have been shown to provide effective and efficient access to understanding an entity by displaying a list of episodes about the entity in chronological order. ...

Adaptive Big Data Analytics for Deceptive Review Detection in Online Social Media

The explosive growth of user-contributed reviews on e-commerce and online social network sites calls for the design of novel big data analytics frameworks to cope with such a challenge. The main contributions of our research are twofold. First, we design a novel big data analytics framework that leverages distributed computing and streaming to efficiently process big social media data streams...

Time Aware Knowledge Extraction for Microblog Summarization on Twitter

Microblogging services like Twitter and Facebook collect millions of pieces of user-generated content every moment about trending news, ongoing events, and so on. Nevertheless, it is a nightmare to find information of interest among the huge number of available posts, which are often noisy and redundant. In general, social media analytics services have caught increasing attention from both side ...

Text Analytics to Support Sense-making in Social Media: A Language-Action Perspective

Social media and online communities provide organizations with new opportunities to support their business-related functions. Despite their various benefits, social media technologies present two important challenges for sense-making. First, online discourse is plagued by incoherent, intertwined conversations that are often difficult to comprehend. Moreover, organizations are increasingly inter...

Interactive Folksonomic Analytics with the Tag River Visualization

Tag River is a novel visualization that presents a detailed comparative overview between user content for a particular span of time. Simultaneously it provides a trend summarization of earlier or later time spans. The summarization is displayed using vertically-adjacent polygonal regions in which the area represents some facet of quantitative information. A series of animated tag clouds are use...

Journal title:
  • IEEE Data Eng. Bull.

Volume 36  Issue 

Pages  -

Publication date 2013